This paper is concerned with online caching algorithms for the $(n,k)$-companion cache,
defined by Brehob et al. In this model the cache is composed of two components: a $k$-way
set-associative cache and a companion fully-associative cache of size $n$. We show that
the deterministic competitive ratio for this problem is $(n+1)(k+1)-1$, and the randomized
competitive ratio is $O(\log n \log k)$ and $\Omega(\log n + \log k)$.

Steve Seiden died in a tragic accident on June 11, 2002. The first named author would 
like to dedicate this paper to his memory. 

1 Introduction 



There is a rapidly growing disparity between computer processor speed and computer memory
speed. Of prime importance in bridging this gap is the cache, the purpose of which is to allow
quick access to memory items that are accessed frequently. Since the cache is so important to
system performance, hardware designers have in recent years proposed a sequence of increasingly
sophisticated cache designs. Cache designs can be conceptually thought of as having two parts:
an architecture and a caching algorithm. The architecture describes the physical structure of
the cache, such as its size and organization. The caching algorithm decides, for a given
sequence of requests for memory items, which items are stored in the cache, and how they are
organized, at each point in time. While there is a large body of theoretical work on caching
algorithms for the simplest types of caches (which we refer to as fully associative), little
theoretical work has been done on algorithms for more complicated cache architectures. In this
paper, we address this deficiency by providing the first theoretical analysis of the
$(n,k)$-companion cache problem for $k > 1$.

Problem Description: A popular cache architecture is the set-associative cache. In a $k$-way
set-associative cache, a cache of size $s$ is divided into $m = s/k$ disjoint sets, each of
size $k$. Addresses in main memory are likewise assigned one of $m$ types, and the $i$'th
associative cache can only store memory cells whose address is of type $i$. Typically, there
are $m = 2^i$ such types, where the $j$'th $k$-way associative cache is indexed by
$0 \le j \le 2^i - 1$ and memory addresses whose last $i$ bits are equal to $j$ are mapped to
the $j$'th associative cache. Special cases include direct-mapped caches, which are 1-way
set-associative caches, and fully-associative caches, which are $s$-way set-associative caches.
Ideally $m$ should be small, but in order to maintain the high speed of the cache, $k$ is
usually very small; 1-, 2- and 4-way caches are most commonly used.
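
The address-to-set mapping described above can be sketched as follows. This is only an illustration under stated assumptions: addresses are taken to be non-negative integers, and the function names `set_index` and `num_sets` are hypothetical, not taken from the paper.

```python
def set_index(address: int, i: int) -> int:
    """Return the index j (0 <= j <= 2**i - 1) of the set that `address` maps to.

    Assumption: the type of an address is given by its last i bits,
    as in the typical scheme described in the text.
    """
    if address < 0 or i < 0:
        raise ValueError("address and i must be non-negative")
    return address & ((1 << i) - 1)  # keep only the last i bits


def num_sets(s: int, k: int) -> int:
    """Number of sets m = s / k in a k-way set-associative cache of size s."""
    if k <= 0 or s <= 0 or s % k != 0:
        raise ValueError("cache size s must be a positive multiple of k")
    return s // k
```

With `k = 1` every slot is its own set (direct-mapped), and with `k = s` there is a single set (fully associative), matching the two special cases named above.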

In order to overcome "hot-spots", where the same set associative cache is being constantly 
accessed, computer architects have designed hybrid cache architectures. Typically such a cache 
has two or more components. A given item can be placed in any of the components of the cache. 
Brehob et al. [3, 4] considered the $(n,k)$-companion cache, which consists of two components:
a $k$-way set-associative cache called the main cache, and a fully-associative cache of size
$n$, called the companion cache (the names stem from the fact that typically $mk \gg n$). As
argued by Brehob et al. [4], many of the L1-cache designs suggested in recent years use
companion caches as the underlying architecture. Several variations on the basic companion
cache structure are possible. These include reorganization/no-reorganization and
bypassing/no-bypassing. Reorganization is the ability to move an item from one cache component
to another, whereas bypassing is the ability to avoid storing an accessed item in the cache. A
schematic view of the companion cache is presented in Fig. 1.

Figure 1: A schematic description of a companion cache.

Since maintenance of the cache must be done online, and this makes it impossible to service 
requests optimally, we use competitive analysis. The usual assumption is that any referenced item 
is brought into the cache before it is accessed. Since items in the cache are accessed much more 
quickly than those outside, we associate costs with servicing items as follows: If the referenced 
item is already in the cache then we say that the reference is a hit and the cost is zero. Otherwise, 
we have a fault or miss, which costs one. Roughly speaking, an online caching algorithm is called
r-competitive if for any request sequence the number of faults is at most r times the number of 
faults of the optimal offline algorithm, allowing a constant additive term. 



Previous Results: Maintenance of a fully associative cache of size $k$ is the well-known paging
problem [5]. Sleator and Tarjan [11] proved that natural algorithms such as Least Recently Used
are $k$-competitive, and that this is optimal for deterministic online algorithms. Fiat et al.
[6], improved by McGeoch and Sleator [10] and Achlioptas et al. [1], show a tight
$H_k \approx \ln k$ competitive randomized algorithm. $k$-way set-associative caches can be
viewed as a collection of independent fully associative caches, each of size $k$, and therefore
they are uninteresting algorithmically.
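
To make the fault-counting cost model concrete, here is a minimal sketch of Least Recently Used on a fully associative cache of size $k$. The function name `lru_faults` and the assumption that the cache starts empty are ours.

```python
from collections import OrderedDict
from typing import Hashable, Iterable


def lru_faults(requests: Iterable[Hashable], k: int) -> int:
    """Count the faults of LRU on a fully associative cache of size k.

    Assumption: the cache starts empty, so first references are faults.
    """
    if k < 1:
        raise ValueError("cache size k must be at least 1")
    cache: "OrderedDict[Hashable, None]" = OrderedDict()
    faults = 0
    for item in requests:
        if item in cache:
            cache.move_to_end(item)        # hit: item becomes most recently used
        else:
            faults += 1                    # fault: item must be brought in
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used item
            cache[item] = None
    return faults
```

On a cyclic sequence over $k+1$ distinct items, for example `lru_faults("abc" * 4, 2)`, LRU faults on every request, which is the kind of sequence behind the deterministic lower bound of $k$.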

Brehob et al. [3] study deterministic online algorithms for $(n,1)$-companion caches. They
investigate the four previously mentioned variants, i.e., bypassing/no-bypassing and
reorganization/no-reorganization.

Previous Results [3] (only for main cache of size $k = 1$):

Table 1: Summary of the results in [3] and in this paper, for the $(n,k)$-companion cache.



Our Results: This paper studies deterministic and randomized caching algorithms for an
$(n,k)$-companion cache. We consider the version where reorganization is allowed but bypassing
is not. We show that the deterministic competitive ratio is exactly $(n+1)(k+1)-1$. For
randomized algorithms, we present an upper bound of $O(\log n \log k)$ on the competitive
ratio, and a lower bound of $\Omega(\log n + \log k)$. For the special case of $k = 1$ that was
studied in [3], our bounds on the randomized competitive ratio are tight up to a constant
factor. The results of [3] and those of this paper are summarized and compared in Table 1.

We note that any algorithm for the reorganization model can be implemented (in online fashion) 
in the no-reorganization model while incurring a cost at most two times larger, and any algorithm 
for the bypassing model can be implemented (in online fashion) in the no-bypassing model while 
incurring a cost at most two times larger. Thus, the competitive ratio (both randomized and 
deterministic) differs by at most a constant factor between the different models. 

The techniques we use generalize phase partitioning and marking algorithms [6].

2 The Problem 

In the $(n,k)$-companion caching problem, there is a slow main memory and a fast cache. The
items in main memory are partitioned into $m$ types; the set of types is $T$ ($|T| = m$). The
cache consists of two separate components:

• The Main Cache: Consisting of a cache of size $k$ for each type. I.e., every type $t$,
$1 \le t \le m$, has its own cache of size $k$ which can hold only items of type $t$.

• The Companion Cache: A cache of size $n$ which can hold items of any type.



We refer to these components collectively simply as the cache. If an item is stored somewhere in 
the cache, we say it is cached. Our basic assumptions are that there are at least k + 1 items of 
every type and that the number of types, m, is greater than the size of the companion cache, n. 

A caching algorithm is faced with a sequence of requests for items. When an item is requested 
it must be cached (i.e., bypassing is not allowed). If the item is not cached, a fault occurs. The 
goal is to minimize the number of faults. A caching algorithm can swap items of the same type 
between the main and companion caches without incurring any additional cost (i.e., reorganization 
is allowed). 
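
Because items of the same type can be swapped between the two components at no cost, a set of items can be cached simultaneously exactly when the items in excess of $k$ per type fit into the $n$ companion slots. The sketch below checks this condition. The encoding of an item as a `(type, name)` pair and the names `companion_load` and `fits` are illustrative assumptions.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple

Item = Tuple[Hashable, Hashable]  # (type, name): a hypothetical encoding of an item


def companion_load(items: Iterable[Item], k: int) -> int:
    """Companion-cache slots needed after each type fills its own k main-cache slots."""
    per_type = Counter(item_type for item_type, _ in set(items))
    return sum(max(0, count - k) for count in per_type.values())


def fits(items: Iterable[Item], n: int, k: int) -> bool:
    """True if all `items` can reside in an (n, k)-companion cache at once."""
    return companion_load(items, k) <= n
```

For instance, with $k = 2$, three items of type `a` and one item of type `b` need exactly one companion slot, so they fit when $n = 1$ but not when $n = 0$.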

We use the competitive ratio to measure the performance of online algorithms. Formally, given 
an item request sequence $\sigma$, the cost of an online algorithm $A$ on $\sigma$, denoted by
$\mathrm{cost}_A(\sigma)$, is the number of faults incurred by $A$. An algorithm is called
$r$-competitive if there exists a constant $c$, such that for any request sequence $\sigma$,

$$E[\mathrm{cost}_A(\sigma)] \le r \cdot \mathrm{cost}_{\mathrm{Opt}}(\sigma) + c.$$

To simplify the analysis later, we mention the following fact (attributed to folklore): 

Proposition 1. We may assume that Opt is lazy, i.e., Opt evicts an item only when a requested
item is not cached.

3 Lower bounds on the competitive ratio 

Straightforward lower bounds follow from the classical paging problem. 

Theorem 1. The deterministic competitive ratio for the $(n,k)$-companion caching problem is at
least $(n+1)(k+1)-1$. The randomized competitive ratio is at least
$H_{(n+1)(k+1)-1} = \Omega(\log n + \log k)$.

Proof. Consider the situation where there are $(n+1)(k+1)$ items of $n+1$ types, $k+1$ items of
each type. In this case, a caching algorithm has $(n+1)k + n = (n+1)(k+1) - 1$ cache slots
available. Comparing this situation to the regular paging problem with a main memory of
$(n+1)(k+1)$ items and a cache size of $(n+1)(k+1)-1$, we find the two problems are exactly the
same. A companion caching algorithm induces a paging algorithm, and the opposite is also true.
Hence a lower bound on the competitive ratio for paging implies the same lower bound for
companion caching. We conclude there are lower bounds of $(n+1)(k+1)-1$ on the deterministic
competitive ratio and $H_{(n+1)(k+1)-1} = \Omega(\log n + \log k)$ on the randomized
competitive ratio for companion caching. □
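
The slot count used in the proof, and the two bounds inherited from paging, can be checked with a few lines of exact arithmetic. This is a sketch for illustration; the function names are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def harmonic(j: int) -> Fraction:
    """Exact harmonic number H_j = 1 + 1/2 + ... + 1/j (H_0 = 0)."""
    return sum((Fraction(1, x) for x in range(1, j + 1)), Fraction(0))


def slots_in_lower_bound_instance(n: int, k: int) -> int:
    """Cache slots usable when only n + 1 types are requested:
    k main-cache slots per type plus n companion-cache slots."""
    return (n + 1) * k + n


def paging_lower_bounds(n: int, k: int) -> Tuple[int, Fraction]:
    """(deterministic, randomized) lower bounds inherited from paging
    with cache size (n + 1)(k + 1) - 1, as in the proof of Theorem 1."""
    size = slots_in_lower_bound_instance(n, k)
    return size, harmonic(size)
```

For $n = 3$ and $k = 2$ this gives $4 \cdot 2 + 3 = 11 = 4 \cdot 3 - 1$ slots.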

4 Phase Partitioning of Request Sequences 

In [6] the request sequence for the paging problem is partitioned into phases as follows: A phase
begins either at the beginning of the sequence or immediately after the end of the previous phase.
A phase ends either at the end of the sequence or immediately before the request for the $(k+1)$st
distinct page in the phase. Similarly, we partition the request sequence for the companion caching 
problem into phases. However, the more complex nature of our problem implies more complex 
partition rules. 
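
For reference, the classical phase partition for paging described in the previous paragraph can be sketched as below. This covers only the paging rule, not the more complex companion-caching rules. The function name is hypothetical, and request indices are 0-based here, whereas the text counts from 1.

```python
from typing import Hashable, List, Sequence


def paging_phases(requests: Sequence[Hashable], k: int) -> List[List[int]]:
    """Partition request indices into phases for paging with cache size k.

    A phase ends immediately before the request for the (k+1)st distinct
    page of the phase, or at the end of the sequence.
    """
    if k < 1:
        raise ValueError("cache size k must be at least 1")
    phases: List[List[int]] = []
    current: List[int] = []
    seen = set()  # distinct pages requested so far in the current phase
    for index, page in enumerate(requests):
        if page not in seen and len(seen) == k:
            phases.append(current)  # this request would be the (k+1)st distinct page
            current, seen = [], set()
        seen.add(page)
        current.append(index)
    if current:
        phases.append(current)      # last, possibly incomplete, phase
    return phases
```

For example, with $k = 2$ the sequence `a b a c b c d` splits into the index groups `[0, 1, 2]`, `[3, 4, 5]` and `[6]`.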

Let $\sigma = \sigma_1, \sigma_2, \ldots, \sigma_{|\sigma|}$ denote the request sequence. The
indices of the sequence are partitioned into a sequence of disjoint consecutive subsequences
$D_1, D_2, \ldots, D_f$, whose concatenation gives $\{1, \ldots, |\sigma|\}$. The indices are
also partitioned into a sequence of disjoint (ascending) subsequences $P_1, P_2, \ldots, P_f$.

$P_i$: The indices of the requests associated with phase $i$.
$D_i$: The indices of the requests issued during phase $i$.
$N(t)$: The indices of requests of type $t$ that have not yet been associated with a phase.
$M(t) = \{\sigma_\ell \mid \ell \in N(t)\}$.
For every type $t \in T$: $M(t) \leftarrow \emptyset$, $N(t) \leftarrow \emptyset$.

Figure 2: Phase partition rules described as an algorithm.

In Figure 2 we describe how to generate the sequences $D_i$ and $P_i$. $D_i$ is a consecutive
sequence of indices of requests issued during phase $i$. $P_i$ is a (possibly non-consecutive,
ascending) sequence of indices of requests associated with phase $i$. Note that $\ell \in D_i$
does not necessarily imply that $\ell \in P_i$ and vice versa. What is true is that
$\ell \in D_i$ implies either that $\ell \in P_{i'}$ for some $i' \ge i$, or
$\ell \notin P_{i'}$ for all $i'$. Note also that for all $i$, $\max D_i \ge \max P_i$.

Given a set of indices $A$ we denote by $I(A) = \{\sigma_\ell \mid \ell \in A\}$ the set of
items requested in $A$, and by $T(A)$ the set of types of items in $I(A)$.

Table 2 shows an example of phase partitioning.

In [6] it is shown that any paging algorithm faults at least once in each complete phase. Here
we show a similar claim for companion caching. 

Proposition 2. For any (online or offline) caching algorithm, it is possible to associate with each 
phase (except maybe the last one) a distinct fault. 

Proof. Consider the request indices in $P_i$ together with the index $j$ that ends the phase
(i.e., $j = \min D_{i+1}$). One of the items in $I(P_i)$ must be evicted after being requested
and before $\sigma_j$ is served. This is simply because the cache cannot hold all these items
simultaneously. We associate this eviction with the phase.

Table 2: An example for an $(n,k)$-companion caching problem where $n = 3$ and $k = 2$. The
types are denoted by the letters $a, b, c, d$. The $i$th item of type
$\beta \in \{a, b, c, d\}$ is denoted by $\beta_i$. Note that the requests for items $d_1$ and
$d_2$ in this example are in $P_3$, even though they are issued during phases 1 and 2 (i.e.,
belong to $D_1$ and $D_2$).

We must show that we have not associated the same eviction to two distinct phases. Let $i_1$
and $i_2$ be two distinct phases, $i_1 < i_2$. If the evictions associated with $i_1$ and $i_2$
are of different items then they are obviously distinct. Otherwise, the evictions associated
with $i_1$ and $i_2$ are of the same type $t$, and $t \in T(P_{i_1}) \cap T(P_{i_2})$, which
means that all indices $\ell \in P_{i_2}$, where $\sigma_\ell$ is of type $t$, must have
$\ell > \max D_{i_1}$. Thus, an eviction associated with phase $i_2$ cannot be associated with
phase $i_1$. □

To help clarify our argument in the proof of Proposition 2, consider the third phase in Table 2.
Here $I(P_3) = \{b_1, b_2, b_3, b_4, b_5, d_1, d_2\}$, and the phase ends because of the request
to $d_3$. It is not possible that all these items reside in the cache simultaneously and thus at
least one of the items in $I(P_3)$ must be evicted before or on the request for item $d_3$. The
item evicted can be either some $b_i$, $i \in \{1,2,3,4,5\}$, or some $d_i$, $i \in \{1,2\}$.
If, for example, the item evicted is some $b_i$, then this eviction must have occurred after
$\max D_1$ (the end of the first phase), and therefore it cannot be an eviction associated with
the first phase.



5 Deterministic Marking Algorithms 

In a manner similar to [6], based on the phase partitioning of Section 4 we define a class of
online algorithms called marking algorithms.

Definition 1. During the request sequence an item $e \in \bigcup_t M(t)$ is called marked (see
Figure 2 for a definition of $M(t)$). An online caching algorithm that never evicts marked items
is called a marking algorithm.

Remarks:

1. The phase partitioning and dynamic update of the set of marked items can be performed in
an online fashion (as given in the algorithm of Fig. 2).

2. At any point in time, the cache can accommodate all marked items.

3. Unlike the marking algorithms of [6], it is not true that immediately after $\max D_i$ all
marks of the $i$th phase are erased. Only the marked items of types $t \in T(P_i)$ will have
their markings erased immediately after $\max D_i$.

For a specific algorithm, at any point in time during the request sequence, a type $t$ that has
more than $k$ items in the cache is called represented in the companion cache. Note that for
marking algorithms, a type is in $T(P_i)$ if and only if it is represented in the companion
cache at $\max D_i$ or it is the type of the item that ended phase $i$.

Proposition 3. The number of faults of any marking algorithm on requests whose indices are in
$P_i$ is at most $n(k+1) + k = (n+1)(k+1) - 1$.

Proof. Each item $e$ of type $t$ requested at a request index $\ell \in P_i$ is marked and is
not evicted until after $\max D_i$. We note that $|T(P_i)| \le n + 1$ since at most $n$ types
are represented in the companion cache, and the type of the item whose request ends the phase
may also be in $T(P_i)$. Thus, $|I(P_i)| \le (n+1)k + n$. □

We conclude from Proposition 2 and Proposition 3:

Theorem 2. Any marking algorithm is $(n+1)(k+1)-1$ competitive.

Proof. Immediate from Proposition 2 and Proposition 3. □

Since the marking property can be realized by deterministic algorithms, we conclude

Corollary 4. The deterministic competitive ratio of the $(n,k)$-companion caching problem is
$(n+1)(k+1)-1$.

6 Randomized Marking Algorithms 

In this section we present an $O(\log n \log k)$ competitive randomized marking algorithm. The
building blocks of our randomized algorithms are the following three eviction strategies.
On a fault on an item of type $t$:

Type Eviction. Evict an item chosen uniformly at random among all unmarked items of type $t$
in the cache.

Cache-wide Eviction. Let $T$ be the set of types represented in the companion cache, and let
$U$ be the set of all unmarked items in the cache whose type is in $T \cup \{t\}$. Evict an
item chosen uniformly at random from $U$.

Skewed Cache-wide Eviction. Let $T$ be the set of types represented in the companion cache, and
let $T' \subseteq T \cup \{t\}$ be the set of types with at least one unmarked item in the
cache. Choose $t'$ uniformly at random from $T'$, let $U$ be the set of all unmarked items of
type $t'$, and evict an item chosen uniformly at random from $U$.

Remarks:

1. Type eviction may not be possible as there may be no unmarked items of type $t$ in the cache.

2. Cache-wide eviction and skewed cache-wide eviction are always possible: if there are no
unmarked pages of types represented in the companion cache and no unmarked pages of type $t$
in the cache, then the fault would have ended the phase.
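
The three eviction strategies can be sketched as functions that only select the victim. The state representation is an assumption made for illustration: `unmarked` maps each type to the list of its unmarked cached items, `represented` is the set of types represented in the companion cache, and all function names are hypothetical. Marking and phase bookkeeping are not modelled.

```python
import random
from typing import Dict, Hashable, Iterable, List, Optional

Type = Hashable
Item = Hashable


def type_eviction(unmarked: Dict[Type, List[Item]], t: Type,
                  rng: random.Random) -> Optional[Item]:
    """Uniform choice among the unmarked cached items of type t (None if there are none)."""
    candidates = unmarked.get(t, [])
    return rng.choice(candidates) if candidates else None


def cache_wide_eviction(unmarked: Dict[Type, List[Item]], represented: Iterable[Type],
                        t: Type, rng: random.Random) -> Optional[Item]:
    """Uniform choice among all unmarked cached items whose type is represented or equals t."""
    types = set(represented) | {t}
    pool = [item for item_type in types for item in unmarked.get(item_type, [])]
    return rng.choice(pool) if pool else None


def skewed_cache_wide_eviction(unmarked: Dict[Type, List[Item]], represented: Iterable[Type],
                               t: Type, rng: random.Random) -> Optional[Item]:
    """Pick a type uniformly among those with unmarked items, then an item of that type."""
    types = [item_type for item_type in set(represented) | {t} if unmarked.get(item_type)]
    if not types:
        return None
    chosen_type = rng.choice(types)
    return rng.choice(unmarked[chosen_type])
```

The difference between the last two strategies shows up when types hold different numbers of unmarked items: with two unmarked items of type `a` and one of type `b`, cache-wide eviction picks the `b` item with probability $1/3$, while the skewed variant picks it with probability $1/2$.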

The algorithms we use are: 

Algorithm TP1. Given a request for item $e$ of type $t$, not in the cache: Update all phase
related status variables (as in the algorithm of Figure 2).

• If $t$ is not represented in the companion cache and there are unmarked items of type $t$, use
type eviction.

• Otherwise, use cache-wide eviction.

Algorithm TP2. Given a request for item $e$ of type $t$, not in the cache: Update all phase
related status variables (as in the algorithm of Figure 2). Let the current request index be
$j \in D_i$, $i \ge 1$.

• If $t$ is not represented in the companion cache and there are unmarked items of type $t$, use
type eviction.

• If $t$ is represented in the companion cache, $e \in I(P_{i-1})$, and there are unmarked items
of type $t$, use type eviction.

• Otherwise, use skewed cache-wide eviction.

Algorithm TP. If $k < n$ use TP1, otherwise use TP2.

In the rest of this section we prove:

Theorem 3. Algorithm TP is $O(\log n \log k)$ competitive.

6.1 Basic Definitions and Proof Overview 

We give an analogue to the definitions of new and stale pages used in the analysis of the
randomized marking paging algorithm of [6].

Definition 2. For phase $i$ and type $t$, denote by $i^{-t}$ the largest index $j < i$ such that
$t \in T(P_j)$. If no such $j$ exists we denote $i^{-t} = 0$, and use the convention that
$P_0 = \emptyset$. Similarly, $i^{+t}$ is the smallest index $j > i$ such that $t \in T(P_j)$.
If no such index exists, we set $i^{+t} = \infty$, and use the convention that
$P_\infty = \emptyset$.

Definition 3. An item $e$ of type $t$ is called new in $P_i$ if
$e \in I(P_i) \setminus I(P_{i^{-t}})$. We denote by $g_{t,i}$ the number of new items of type
$t$ in $P_i$. Note that if $t \notin T(P_i)$ then $g_{t,i} = 0$.
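
As an illustration of Definitions 2 and 3, the sketch below computes $i^{-t}$ and $g_{t,i}$ from a given list of phases. The representation is an assumption: `phases[i]` maps each type in $T(P_i)$ to the set of items of that type in $I(P_i)$, and `phases[0]` is the empty phase $P_0$. The function names are hypothetical.

```python
from typing import Dict, Hashable, List, Set

Phase = Dict[Hashable, Set[Hashable]]  # type -> items of that type in I(P_i)


def prev_phase_with_type(phases: List[Phase], i: int, t: Hashable) -> int:
    """i^{-t}: the largest index j < i with t in T(P_j), or 0 if there is none."""
    for j in range(i - 1, 0, -1):
        if phases[j].get(t):
            return j
    return 0


def new_item_count(phases: List[Phase], i: int, t: Hashable) -> int:
    """g_{t,i}: the number of items of type t in I(P_i) that are not in I(P_{i^{-t}})."""
    current = phases[i].get(t, set())
    previous = phases[prev_phase_with_type(phases, i, t)].get(t, set())
    return len(current - previous)
```

If type `a` appears in phases 1 and 3 but not in phase 2, then $3^{-a} = 1$, and only items of type `a` requested in $P_3$ but not in $P_1$ count as new in $P_3$.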

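Let $i_{\mathrm{end}}$ denote the index of the last completed phase.

Definition 4. For $t \in T(P_i)$, let $L_{t,i} = I(P_i) \cap \{\text{items of type } t\}$. Note
that $|L_{t,i}| \ge k$. Define

$$\ell_{t,i} = \begin{cases} |L_{t,i}| - k & i < i_{\mathrm{end}} \wedge t \in T(P_i) \setminus T(P_{i+1}), \\ 0 & \text{otherwise.} \end{cases}$$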




We will use the above definitions to give an amortized lower bound (see Lemma 7) on the cost
of Opt of dealing with the sequence $\sigma$:

$$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \frac{1}{5} \sum_{i \le i_{\mathrm{end}}} \sum_{t \in T(P_i)} (g_{t,i} + \ell_{t,i^{-t}}). \qquad (1)$$

Our algorithms belong to a restricted family of randomized algorithms, specifically uniform type
preference algorithms defined below. The main advantage of using such algorithms is that their
analysis is simplified as they have the property that while dealing with requests $\sigma_j$,
$j \in D_i$, the companion cache is restricted to containing only items of types in
$T(P_i) \cup T(P_{i-1})$.

Definition 5. A type preference algorithm is a marking algorithm such that when a fault occurs 
on an item of a type that is not represented in the companion cache, it evicts an item of the same 
type, if this is possible. 

Definition 6. A uniform type preference algorithm is a randomized type preference algorithm
maintaining the invariant that at any point in time between request indices
$1 + \max D_{i^{-t}}$ and $\max D_i$, inclusive, and any type $t \in T(P_i)$, all unmarked items
of type $t$ in $I(P_{i^{-t}})$ are equally likely to be in the cache.

Note that both TP1 and TP2 are uniform type preference algorithms.

We use a charge-based amortized analysis to compute the online cost of dealing with a request
sequence $\sigma$. We charge the expected cost of all but a constant number of requests in
$\sigma$ to at least one of two "charge counts", charge($D_i$) and/or charge($P_j$) for some
$1 \le i \le j \le i_{\mathrm{end}}$. The total cost associated with the online algorithm is
bounded above by a constant times
$\sum_{1 \le i \le i_{\mathrm{end}}} \mathrm{charge}(D_i) + \sum_{1 \le i \le i_{\mathrm{end}}} \mathrm{charge}(P_i)$,
excluding a constant number of requests.

Other than a constant number of requests, every request $\sigma_\ell \in \sigma$ has
$\ell \in D_{i_1} \cup P_{i_2}$ for some $1 \le i_1 \le i_2 \le i_{\mathrm{end}}$.

We use the following strategy to charge the cost associated with this request to one (or more)
of the charge($D_i$), charge($P_i$):

1. If $\ell \in P_i$ and type($\sigma_\ell$) $\in T(P_i) \setminus T(P_{i-1})$ then we charge
the (expected) cost of $\sigma_\ell$ to charge($P_i$). These charges can be amortized against
the cost of Opt to deal with $\sigma$. This amortization is summarized in Proposition 9 (for any
uniform type preference algorithm).

2. If $\ell \in D_i$ and type($\sigma_\ell$) $\in T(P_{i-1})$ then we charge the (expected) cost
of $\sigma_\ell$ to charge($D_i$). These charges will be amortized against the cost of Opt to
within a poly-logarithmic factor. This amortization is summarized in Proposition 13 for
algorithm TP1 and Proposition 15 for algorithm TP2.

To compute the expected cost of a request $\sigma_\ell$, $\ell \in D_i$,
type($\sigma_\ell$) $\in T(P_{i-1})$, we introduce an analogue to the concept of "holes" used in
[6]. In [6] holes were defined to be stale pages that were evicted from the cache.

Definition 7. We define the number of holes during $D_i$, $h_i$, to be the maximum over the
indices $j \in D_i$ of the total number of items of types in $T(P_{i-1})$ that were requested in
$P_{i-1}$ but are not cached when the $j$th request is issued.



6.2 Analysis of the Competitive Ratio for Algorithm TP 
6.2.1 Lower Bounds on Opt 
Proposition 5. For any request sequence $\sigma$,

$$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \frac{1}{2} \sum_{i} \sum_{t \in T(P_i)} g_{t,i}.$$

Proof. We may assume without loss of generality that Opt is lazy. Let $C_i$ be the items in
Opt's cache at the end of $P_i$ ($C_0 = \emptyset$). For pairs $i, t$, let $G'_{t,i}$ be the set
of new items in $I(P_i)$ of type $t$ that do not appear in $C_{i^{-t}}$, and let $G''_{t,i}$ be
the set of new items in $I(P_i)$ of type $t$ that do appear in $C_{i^{-t}}$. From the
definitions, $|G'_{t,i}| + |G''_{t,i}| = g_{t,i}$.

First we show that
$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \sum_i \sum_{t \in T(P_i)} |G'_{t,i}|$. For any
$t \in T(P_i)$ and for any item $a \in G'_{t,i}$ we have $a \in I(P_i) \setminus C_{i^{-t}}$.
Thus, for any lazy algorithm, the first request for $a$ in $P_i$ is a fault. Let the request
sequence be $\sigma = \sigma_1, \sigma_2, \ldots$; we define

$$J(G'_{t,i}) = \{\, j \mid j = \min\{\ell \mid \sigma_\ell = a,\ \ell \in P_i\},\ a \in G'_{t,i} \,\}, \quad \text{and} \quad J'_i = \bigcup_{t \in T(P_i)} J(G'_{t,i}).$$

For any lazy algorithm, $J'_i$ is a set of request indices that result in faults. We are left to
argue that $J'_{i_1} \cap J'_{i_2} = \emptyset$ for $i_1 \ne i_2$, but this is obvious, since
$J'_i \subseteq P_i$, and $P_{i_1} \cap P_{i_2} = \emptyset$.

Next, we show that
$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \sum_i \sum_{t \in T(P_i)} |G''_{t,i}|$. Note that
$G''_{t,i^{+t}} \subseteq C_i$, i.e., items in $G''_{t,i^{+t}}$ are in Opt's cache after serving
$\max D_i$. As Opt is lazy, all items in $G''_{t,i^{+t}}$ must reside in the cache continuously
since request index $\max D_{i^{-t}}$. The slots used to store these items will be unavailable
to deal with requests whose indices are in $P_i$. Consider the behavior of Opt on the request
sequence $\sigma$. We claim that Opt must have at least
$\sum_{t \in T(P_i)} |G''_{t,i^{+t}}|$ evictions of items that were requested in $P_i$, after
their request, and before $\max D_i$.

For every type $t \in T(P_i)$ there were $k + a_t$ requests to different items of type $t$ in
$P_i$, $\sum_{t \in T(P_i)} a_t = n$. The total memory that we have available to deal with these
$n + k|T(P_i)|$ different items is no more than $n + k|T(P_i)|$ minus the number of slots that
are unavailable, i.e., the number of slots available for requests whose indices are in $P_i$ is
no more than

$$n + k|T(P_i)| - \sum_{t \in T(P_i)} |G''_{t,i^{+t}}|.$$

Thus, Opt must have evicted at least $\sum_{t \in T(P_i)} |G''_{t,i^{+t}}|$ of them by
$\max D_i$.

To argue that we do not count the items in $G''_{t,i^{+t}}$ more than once, we note that if
$t \in T(P_{i_1}) \cap T(P_{i_2})$ for $i_1 \ne i_2$ then $i_1^{+t} \ne i_2^{+t}$. □

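Proposition 6. For any request sequence $\sigma$,

$$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \frac{1}{3} \sum_{i < i_{\mathrm{end}}} \sum_{t \in T(P_i)} \ell_{t,i}.$$

Proof. Once again, we can assume Opt is lazy. Fix $i < i_{\mathrm{end}}$, and a type
$t \in T(P_i)$ such that $t \notin T(P_{i+1})$. Let

$$L'_{t,i} = L_{t,i} \setminus C_{i+1}; \qquad L''_{t,i} = L_{t,i} \cap C_{i+1}.$$

Every item $e \in L'_{t,i}$ has some $\ell \in P_i$ such that $\sigma_\ell = e$, and $e$ was
evicted by Opt later (but before $\max D_{i+1}$). Let $\ell$ be the largest such
$\ell \in P_i$. We associate one eviction of $e$ with index $\ell$. In this way every eviction
is associated with at most one index. Indeed, the associated eviction occurs not before
$\min D_i$, and before $\max D_{i+1}$. In this time frame, only items from $L_{t,i}$ and
$L_{t,i+1}$ could have been associated with this eviction, but $t \notin T(P_{i+1})$. Therefore
$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \sum_{i < i_{\mathrm{end}}} \sum_{t \in T(P_i) \setminus T(P_{i+1})} |L'_{t,i}|$.

Let $t \in T(P_i) \setminus T(P_{i+1})$, and assume $|L''_{t,i}| > k$. Such items occupy at least

$$\phi_{i+1} = \sum_{t \in T(P_i) \setminus T(P_{i+1})} \max\{|L''_{t,i}| - k, 0\}$$

slots in the companion cache at time $\max D_{i+1}$. In $P_{i+1}$ there are requests for
$|T(P_{i+1})|k + n$ different items, but considering the cache at time $\max D_{i+1}$, these
items occupy at most $|T(P_{i+1})|k + n - \phi_{i+1}$ slots. This means that at least
$\phi_{i+1}$ of the items $I(P_{i+1})$ were evicted subsequently to being requested at request
indices in $P_{i+1}$ and no later than $\max D_{i+1}$.

Associate each such eviction of item $a \in I(P_{i+1})$ with the largest index
$\ell \in P_{i+1}$ such that $\sigma_\ell = a$. Note that each such eviction is associated with
only one index, and therefore

$$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \sum_{i < i_{\mathrm{end}}} \sum_{t \in T(P_i) \setminus T(P_{i+1})} \max\{|L''_{t,i}| - k, 0\}.$$

We conclude

$$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \max\Bigl\{ \sum_{i < i_{\mathrm{end}}} \sum_{t \in T(P_i) \setminus T(P_{i+1})} |L'_{t,i}|,\ \sum_{i < i_{\mathrm{end}}} \sum_{t \in T(P_i) \setminus T(P_{i+1})} \max\{|L''_{t,i}| - k, 0\} \Bigr\}$$
$$\ge \frac{1}{3} \sum_{i < i_{\mathrm{end}}} \sum_{t \in T(P_i) \setminus T(P_{i+1})} \max\{|L'_{t,i}| + |L''_{t,i}| - k, 0\} = \frac{1}{3} \sum_{i < i_{\mathrm{end}}} \sum_{t \in T(P_i)} \ell_{t,i}. \qquad □$$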

By taking a convex combination of the lower bounds of Proposition 5 and Proposition 6 and
by algebraic manipulations, we conclude:

Lemma 7. For any request sequence $\sigma$,

$$\mathrm{cost}_{\mathrm{Opt}}(\sigma) \ge \frac{1}{5} \sum_{i \le i_{\mathrm{end}}} \sum_{t \in T(P_i)} (g_{t,i} + \ell_{t,i^{-t}}).$$

6.2.2 Upper Bounds on TP 

Proposition 8. Consider a marking algorithm, a phase $i$, a type
$t \in T(P_i) \setminus T(P_{i-1})$, and a request index $\max D_{i^{-t}} < j \le \max D_i$. Let
$H$ be the set of items of type $t$ that were requested in $P_{i^{-t}}$ and evicted afterward
without being requested again, up to request index $j$ (inclusive). Then
$|H| \le \hat{g}_{t,i} + \ell_{t,i^{-t}}$, where $\hat{g}_{t,i} \le g_{t,i}$ is the number of
new items of type $t$ requested after $\max D_{i^{-t}}$ and up to time $j$ (inclusive).

Proof. Recall that $L_{t,i^{-t}}$ is the set of marked items of type $t$ after serving
$\max D_{i^{-t}}$. Let $\hat{G}_{t,i}$ be the set of items requested after request
$\max D_{i^{-t}}$ and before request index $j$ that are not in $L_{t,i^{-t}}$, i.e.,
$\hat{G}_{t,i}$ is the set of new items of type $t$ requested up to request index $j$.

If $i^{-t} = 0$ then $H \subseteq L_{t,i^{-t}} = \emptyset$. Otherwise, as
$H \subseteq L_{t,i^{-t}} \subseteq L_{t,i^{-t}} \cup \hat{G}_{t,i}$, and $k$ items of
$L_{t,i^{-t}} \cup \hat{G}_{t,i}$ are always in the (main) cache, we conclude

$$|H| \le |L_{t,i^{-t}} \cup \hat{G}_{t,i}| - k \le (|L_{t,i^{-t}}| - k) + |\hat{G}_{t,i}| = \ell_{t,i^{-t}} + \hat{g}_{t,i}. \qquad □$$

Proposition 9. For a uniform type preference algorithm, the expected number of faults on request
indices in $P_i$ for items of type $t \in T(P_i) \setminus T(P_{i-1})$ is at most
$(1 + H_{n+k})(g_{t,i} + \ell_{t,i^{-t}})$. I.e.,

$$\mathrm{charge}(P_i) \le (1 + H_{n+k}) \sum_{t \in T(P_i) \setminus T(P_{i-1})} (g_{t,i} + \ell_{t,i^{-t}}).$$

Proof. Fix a type $t \in T(P_i) \setminus T(P_{i-1})$. There are $g_{t,i}$ faults on new items
of type $t$; the rest of the faults are on items in $L_{t,i^{-t}}$ that were evicted before
being requested again. By Proposition 8 the number of items in $L_{t,i^{-t}}$ that are not in
the cache at any point of time is at most
$\hat{g}_{t,i} + \ell_{t,i^{-t}} \le g_{t,i} + \ell_{t,i^{-t}}$. For any $a, b$ in
$L_{t,i^{-t}}$ that have not been requested after $\max D_{i^{-t}}$, the probability that $a$
has been evicted since $1 + \max D_{i^{-t}}$ is equal to the probability that $b$ has been
evicted since $1 + \max D_{i^{-t}}$.

Let $r$ denote the number of items in $L_{t,i^{-t}}$ that have been requested after
$\max D_{i^{-t}}$. There are $|L_{t,i^{-t}}| - r$ unmarked items of $L_{t,i^{-t}}$. The
probability that an unmarked item of $L_{t,i^{-t}}$ is not cached is therefore at most
$(g_{t,i} + \ell_{t,i^{-t}})/(|L_{t,i^{-t}}| - r)$. Thus, the expected number of faults on
request indices in $P_i$ for items in $L_{t,i^{-t}}$ is at most

$$\sum_{r=0}^{|L_{t,i^{-t}}| - 1} \frac{g_{t,i} + \ell_{t,i^{-t}}}{|L_{t,i^{-t}}| - r} \le (g_{t,i} + \ell_{t,i^{-t}}) H_{|L_{t,i^{-t}}|} \le (g_{t,i} + \ell_{t,i^{-t}}) H_{n+k}. \qquad □$$

The following proposition is immediate from the definitions. 
Proposition 10. A type preference algorithm has the following properties: 

1. During $D_i$, only types in $T(P_{i-1}) \cup T(P_i)$ may be represented in the companion
cache.

2. During $D_i$, when a type $t \in T(P_i) \setminus T(P_{i-1})$ becomes represented in the
companion cache, there are no unmarked cached items of type $t$, and $t$ stays represented in
the companion cache until $\max D_i$, inclusive.

Recall the definition of $h_i$, the "number of holes during $D_i$" (Definition 7).

Proposition 11. For a type preference algorithm,

$$h_i \le \sum_{t \in T(P_i) \setminus T(P_{i-1})} (g_{t,i} + \ell_{t,i^{-t}}) + \sum_{t \in T(P_{i-1})} g_{t,(i-1)^{+t}}. \qquad (2)$$

Proof. At time $\min D_i$, among the types in $T(P_{i-1}) \cup T(P_i)$, only types in
$T(P_i) \setminus T(P_{i-1})$ may have uncached items from $L_{t,i^{-t}}$. By Proposition 8 the
number of such items at the beginning of $P_i$ is at most
$\sum_{t \in T(P_i) \setminus T(P_{i-1})} (\hat{g}_{t,i} + \ell_{t,i^{-t}})$, where
$\hat{g}_{t,i}$ is the number of new items of type $t$ requested until $\max D_{i-1}$
(inclusive).

Consider an eviction of an item of type in $T(P_i) \cup T(P_{i-1})$ during $D_i$. The eviction
must be caused by a request to an item of either the same type or a type represented in the
companion cache. By Proposition 10 the types represented in the companion cache are in
$T(P_i) \cup T(P_{i-1})$, and therefore the type of the requested item is also in
$T(P_i) \cup T(P_{i-1})$. If the requested item is an item of $L_{t,i^{-t}}$, then the number of
uncached items from $L_{t,i^{-t}}$ has not changed. Otherwise, it is a new item and thus the
number of new items increases. In total, we have bounded $h_i$ as in Inequality (2). □

Definition 8. At any point during $D_i$, call a type $t \in T(P_{i-1})$ that has unmarked items
in the cache and is represented in the companion cache an active type. Call an unmarked item
$e \in I(P_{i-1})$ of an active type an active item.

Note that an active item may not be cached. 

The following proposition is immediate from the definitions. 

Proposition 12. The following properties hold for type preference algorithms: 

1. During $D_i$, the set of active types is monotone decreasing w.r.t. containment.

2. During $D_i$, the set of active items is monotone decreasing w.r.t. containment.

Proposition 13. For TPi, charge{Di) — The expected number of faults on request indices in Di 
to types in T(Pj_i) — is at most /ij(l + Hk+i{l + iJ(„+i)(fe+i))). 

Proof. First, we count the expected number of faults on items in ^teflPi-x^^t.i-i- By Proposi- 
tion 1121 the set of active items is monotone decreasing, where an item becomes inactive either by 
being marked, or because its type is no longer represented in the companion cache. Let (w.j)j=i^,,,^^, 
be the sequence of numbers of active items indexed on the events. An event is either when an active 
item is requested, or when an active type t becomes inactive by being no longer represented in the 
companion cache (it is possible that one request generates two events, one from each case). 

If the jth event is a request for active item, then mj+i = ruj — 1. Otherwise, if the jth event 
is the event of type t becoming inactive, and before that event there were h active items of type t, 
then m-,_|_i = rrij — b. 

In the first case, the expected cost of the request, conditioned on rrij, is at most hi/mj. 

In the second case, there are b items of type t that became inactive, each had probability at 
most -^ of not being in the cache at that moment. This means that the expected number of items 
among the up-until now active items of type t, that are not in the cache, at this point in time, is 
at most ^. 

Let $g_t$ denote the number of new items of $P_{(i-1)+_t}$ (see its definition) requested during $D_i$ 
($g_t \le g_{t,(i-1)+_t}$). After type $t$ becomes inactive, the number of items among $L_{t,i-1}$ that 
are not in the cache can increase only when a new item of type $t$ is requested. Therefore the expected 
number of items among $L_{t,i-1}$ that are not in the cache, after the $j$th event (the event when $t$ 
became inactive), is at most $\frac{b h_i}{m_j} + g_t$. 

Because of the uniform type eviction property of TP1, the probability that an item in $L_{t,i-1}$ is 
not in the cache is the expected number of items among $L_{t,i-1}$ that are not in the cache, divided by 
the number of unmarked items among $L_{t,i-1}$. Therefore the expected number of faults on items 
of $L_{t,i-1}$ after the $j$th event is at most 

$$\sum_{a=1}^{b} \left(\frac{h_i b}{m_j} + g_t\right) \frac{1}{a} = \left(\frac{h_i b}{m_j} + g_t\right) H_b.$$

Note that $b \le k + 1$, and $\sum_{t \in T(P_{i-1})} g_t \le h_i$, and so the expected number of faults on items 
$e \in \bigcup_{t \in T(P_{i-1})} L_{t,i-1}$, conditioned on the sequence $(m_j)_j$, is at most 
The sequence $(m_j)_j$ is itself a random variable, but we can give an upper bound on the expected 
number of faults on items $e \in \bigcup_{t \in T(P_{i-1})} L_{t,i-1}$ by bounding the maximum of Eq. (3) 
over all feasible sequences $(m_j)_j$. The worst case for (3) is when 
$(m_j)_j = ((n+1)(k+1) - j)_{j=1,2,\ldots}$. Thus, 

$$h_i H_{k+1}\left(1 + \sum_j \frac{1}{m_j}\right) \le h_i H_{k+1}\left(1 + H_{(n+1)(k+1)}\right).$$

We are left to add faults on new items of types in $T(P_{i-1})$. There are at most 
$\sum_{t \in T(P_{i-1})} g_{t,(i-1)+_t} \le h_i$ such faults. □ 
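
The worst-case claim can be checked numerically. The following is a minimal sketch rather than part of the proof: it assumes, following the two cases above, that an event deactivating $b$ of the $m_j$ currently active items contributes $b/m_j$ per unit of $h_i$, and it compares unit decrements against whole-type decrements on one small instance. The names `harmonic` and `event_sum` are hypothetical.

```python
from fractions import Fraction

def harmonic(n):
    """Return the n-th harmonic number H_n = 1 + 1/2 + ... + 1/n exactly."""
    return sum((Fraction(1, a) for a in range(1, n + 1)), Fraction(0))

def event_sum(start, drops):
    """Sum b/m over events, where m is the number of active items just
    before the event and b is how many items that event deactivates."""
    m, total = start, Fraction(0)
    for b in drops:
        if not 1 <= b <= m:
            raise ValueError("each event must deactivate between 1 and m items")
        total += Fraction(b, m)
        m -= b
    return total

n, k = 3, 2
start = (n + 1) * (k + 1)                      # 12 active items in total
unit = event_sum(start, [1] * start)           # one item deactivated per event
grouped = event_sum(start, [k + 1] * (n + 1))  # a whole type deactivated per event
```

On this instance `unit` equals `harmonic(12)` while `grouped` is strictly smaller, which matches the intuition that merging $b$ unit steps into one event replaces $\frac{1}{m} + \frac{1}{m-1} + \cdots + \frac{1}{m-b+1}$ by the smaller quantity $\frac{b}{m}$.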

We conclude 

Lemma 14. TP1 is $O(\log k \max\{\log n, \log k\})$ competitive. 

Proof. Each fault is counted by either charge($P_i$) (Proposition 11) or charge($D_i$) (Proposition 13) 
(faults on request indices in $D_i$ for items of type in $T(P_{i-1}) \setminus T(P_i)$ are counted twice), and by 
Lemma 3 we have that the expected number of faults of TP1 is at most 

$$\left(5(1 + H_{n+k}) + 10\left(1 + H_{k+1}(1 + H_{(n+1)(k+1)})\right)\right) \mathrm{cost}_{\mathrm{OPT}}.$$ □ 

For algorithm TP2, we have similar arguments. 

Proposition 15. For TP2, charge($D_i$), the expected number of faults on request indices in $D_i$ 
to types in $T(P_{i-1})$, is at most $h_i(1 + H_{n+k}(1 + H_{n+1}))$. 

Proof. Denote by $A$ the set of active types at some point in time during $D_i$. We claim that, 
conditioned on the set of active types $A$, for any two active types $t_1, t_2 \in A$, the expected number 
of items in $L_{t_1,i-1}$ that are not currently in the cache is equal to the expected number of items of 
$L_{t_2,i-1}$ that are not currently in the cache. 

We prove this by induction on the length of the request sequence. Before request index $\min D_i$, 
all items among the active types are in the cache, and the claim trivially holds. A fault on an item 
of $L_{t,i-1}$, not currently in the cache, of active type $t$, is served by type eviction, and therefore the 
number of items from $L_{t,i-1}$ that are not in the cache does not change. A fault on an item of a type 
that is not represented in the companion cache and that has unmarked items is served by type eviction, 
and since that type is not active, it does not change the numbers of active items not in the cache. 

If type eviction is not used, then the fault is served by increasing the number of items not in the 
cache among the active types. In this case a skewed cache-wide eviction is used, which chooses a 
page to evict in a two-stage process: first it chooses an active type uniformly at random, and then 
it evicts an unmarked page of that type chosen uniformly at random. 
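
The two-stage choice can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's definition: the dictionary `unmarked_by_type` (active type to its unmarked cached items) is a hypothetical representation, item names are assumed unique across types, and skipping active types that have no unmarked item is an assumption of the sketch.

```python
import random
from fractions import Fraction

def skewed_eviction_distribution(unmarked_by_type):
    """Exact eviction probability of each item under the two-stage rule:
    pick an active type uniformly, then an unmarked item of it uniformly."""
    # Assumption: active types with no unmarked item are not candidates.
    candidates = {t: items for t, items in unmarked_by_type.items() if items}
    a = len(candidates)
    return {item: Fraction(1, a * len(items))
            for items in candidates.values() for item in items}

def skewed_evict(unmarked_by_type, rng=random):
    """Sample one item to evict using the same two-stage rule."""
    candidates = {t: items for t, items in unmarked_by_type.items() if items}
    chosen_type = rng.choice(list(candidates))   # stage 1: uniform over types
    return rng.choice(candidates[chosen_type])   # stage 2: uniform within type

# Hypothetical cache state: type "t1" has one unmarked item, "t2" has three.
cache = {"t1": ["a"], "t2": ["b", "c", "d"]}
dist = skewed_eviction_distribution(cache)
victim = skewed_evict(cache, random.Random(0))
```

In this state the single item of `t1` is evicted with probability 1/2 and each item of `t2` with probability 1/6, whereas a uniform choice over all four items would give 1/4 each. The rule is "skewed" in exactly this sense: it equalizes the expected number of evicted items per active type rather than per item.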

Given an active type $t \in A$ at some point in time in $D_i$, we use the following notation: 
$u_t$ denotes the number of items in $L_{t,i-1}$ that are not currently in the cache, and $r_t$ the number of 
items among $L_{t,i-1}$ requested so far. Note that $u_t$ is a random variable. The probability that an 
active item of active type $t$ is not in the cache, conditioned on $u_t = y$, is $\frac{y}{|L_{t,i-1}| - r_t}$. 
Thus, the probability that an active item of type $t$ is not in the cache is 
$\sum_y \frac{y}{|L_{t,i-1}| - r_t} \Pr[u_t = y]$. Note that 

$$\sum_y y \Pr[u_t = y] = E[u_t].$$

Recall that the expectations $E[u_t]$ are all equal for active types $t$. Assuming there are $a$ active 
types and that type $t$ is active, 

$$E[u_t] = \frac{\#\,\text{active items not in cache}}{a} \le \frac{h_i}{a}. \qquad (4)$$
Let $b_t$ be the number of active types immediately before type $t$ became inactive. Thus, 
$E[u_t] \le \frac{h_i}{b_t}$ immediately before type $t$ becomes inactive. From this point in time on, $u_t$ 
could increase only if a new item of type $t$ is requested. We can therefore bound 
$E[u_t] \le \frac{h_i}{b_t} + g_{t,(i-1)+_t}$ throughout $D_i$. The expected number of faults on items of type $t$ 
is at most 

$$\sum_{r=0}^{|L_{t,i-1}|-1} \frac{\frac{h_i}{b_t} + g_{t,(i-1)+_t}}{|L_{t,i-1}| - r} = \left(\frac{h_i}{b_t} + g_{t,(i-1)+_t}\right) H_{|L_{t,i-1}|}.$$

Using the facts that $|L_{t,i-1}| \le n + k$ and $\sum_{t \in T(P_{i-1})} g_{t,(i-1)+_t} \le h_i$, and summing over all 
$t \in T(P_{i-1})$, the expected number of faults on types in $T(P_{i-1})$ is at most 

$$\sum_{t \in T(P_{i-1})} \left(\frac{h_i}{b_t} + g_{t,(i-1)+_t}\right) H_{|L_{t,i-1}|} \le h_i H_{n+k} \sum_{t \in T(P_{i-1})} \frac{1}{b_t} + h_i H_{n+k} \le h_i H_{n+k}(1 + H_{n+1}).$$

We have bounded from above the expected number of faults on items in $\bigcup_{t \in T(P_{i-1})} L_{t,i-1}$. 
We also need to add at most $h_i$ faults on new items of types in $T(P_{i-1})$. □ 
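
As a sanity check on the last chain of inequalities, the per-type sum can be evaluated on a small instance and compared with the closed form. The sketch below is illustrative only: the function names and the sample values are hypothetical, and it assumes that the $b_t$ are distinct values in $1, \ldots, n+1$, that the $g_t$ sum to at most $h_i$, and that every $|L_{t,i-1}|$ is at most $n + k$.

```python
from fractions import Fraction

def harmonic(n):
    """Return the n-th harmonic number H_n exactly."""
    return sum((Fraction(1, a) for a in range(1, n + 1)), Fraction(0))

def type_fault_bound(h, types):
    """Sum over types of (h / b_t + g_t) * H_{|L_t|}.

    types is a list of (b_t, g_t, size_t) triples, where size_t = |L_t|.
    """
    return sum(((Fraction(h, b) + g) * harmonic(size) for b, g, size in types),
               Fraction(0))

def closed_form_bound(h, n, k):
    """The right-hand side h * H_{n+k} * (1 + H_{n+1})."""
    return h * harmonic(n + k) * (1 + harmonic(n + 1))

n, k, h = 3, 2, 4
# Hypothetical instance: b_t distinct in 1..n+1, sum of g_t = 4 <= h, sizes <= n+k.
types = [(4, 1, 5), (3, 0, 4), (2, 2, 5), (1, 1, 3)]
lhs = type_fault_bound(h, types)
rhs = closed_form_bound(h, n, k)
```

Here `lhs` stays below `rhs`, as the derivation predicts; the slack comes from replacing each $H_{|L_{t,i-1}|}$ by $H_{n+k}$.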

We summarize. 

Lemma 16. TP2 is $O(\log n \max\{\log n, \log k\})$ competitive. 

Proof. Each fault is counted by either charge($P_i$) (Proposition 11) or charge($D_i$) (Proposition 15) 
(faults on request indices in $D_i$ for items of type in $T(P_{i-1}) \setminus T(P_i)$ are counted twice), and by 
Lemma 3 we have that the expected number of faults of TP2 is at most 

$$\left(5(1 + H_{n+k}) + 10\left(1 + (1 + H_{n+1}) H_{n+k}\right)\right) \mathrm{cost}_{\mathrm{OPT}}.$$ □ 

Proof of Theorem. Follows immediately from Lemma 14 and Lemma 16. □ 
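
The two lemmas complement each other: the bound of Lemma 14 is the smaller one when $k$ is small relative to $n$, and the bound of Lemma 16 when $n$ is small relative to $k$. A minimal sketch that evaluates the two explicit constants from the proofs above; the function names are hypothetical and floating-point harmonic numbers are used for convenience.

```python
import math

def harmonic(n):
    """Return the n-th harmonic number H_n as a float."""
    return math.fsum(1.0 / a for a in range(1, n + 1))

def tp1_bound(n, k):
    """Constant from the proof of Lemma 14."""
    return (5 * (1 + harmonic(n + k))
            + 10 * (1 + harmonic(k + 1) * (1 + harmonic((n + 1) * (k + 1)))))

def tp2_bound(n, k):
    """Constant from the proof of Lemma 16."""
    return (5 * (1 + harmonic(n + k))
            + 10 * (1 + (1 + harmonic(n + 1)) * harmonic(n + k)))

def better_algorithm(n, k):
    """Name the algorithm whose proven bound is smaller for this (n, k)."""
    return "TP1" if tp1_bound(n, k) <= tp2_bound(n, k) else "TP2"
```

For example, `better_algorithm(1000, 2)` selects TP1 and `better_algorithm(2, 1000)` selects TP2. Taking the better of the two for each pair $(n, k)$ is what yields a bound of order $\log n \log k$.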

Unfortunately, the competitive ratio of a type preference algorithm is always $\Omega(\log n \log k)$. 

Example 1. The following example proves that the competitive ratio of a type preference algorithm 
is always $\Omega(\log n \log k)$. Let $A$ be a type preference algorithm. Let $m = n + 1$ and assume there 
are exactly $k + 1$ items of each type. In each $P_i$ there is only one new item. At the beginning of 
phase $i$, $\min D_i$, the adversary requests all the items with the same type as the new item, and $A$ 
incurs a cost of $H_{k+1}$. After that, $A$ is forced to evict an item of a different type. The adversary 
chooses a type that has the hole in it with probability at least $\frac{1}{n}$ and requests all the items of this 
type, each time choosing the item with maximum probability of being a hole. This costs $A$ 

$$\frac{1}{n(k+1)} + \frac{1}{nk} + \frac{1}{n(k-1)} + \cdots + \frac{1}{n} = \frac{H_{k+1}}{n}.$$

After that, the hole is in one of $n - 1$ types. Again, the adversary picks a type that has the hole in 
it with probability at least $\frac{1}{n-1}$ and requests all the items of this type, each time choosing the item 
with maximum probability of being a hole, which costs $A$ $H_{k+1}/(n - 1)$, and so on. In total, the 
expected cost for $A$ for the phase is $H_{k+1} H_n$. 
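
The arithmetic of the example can be verified exactly. The sketch below, with hypothetical function names, sums the cost of one sweep over a type when the hole is spread over `remaining_types` types, and then adds the sweeps for $n, n-1, \ldots, 1$ remaining types.

```python
from fractions import Fraction

def harmonic(n):
    """Return the n-th harmonic number H_n exactly."""
    return sum((Fraction(1, a) for a in range(1, n + 1)), Fraction(0))

def type_sweep_cost(remaining_types, k):
    """1/(r(k+1)) + 1/(rk) + ... + 1/r for r = remaining_types."""
    return sum((Fraction(1, remaining_types * j) for j in range(1, k + 2)),
               Fraction(0))

def sweeps_cost(n, k):
    """Total cost of the sweeps with n, n-1, ..., 1 remaining types."""
    return sum((type_sweep_cost(r, k) for r in range(1, n + 1)), Fraction(0))
```

For any positive `n` and `k`, `type_sweep_cost(n, k)` equals `harmonic(k + 1) / n`, and `sweeps_cost(n, k)` equals `harmonic(k + 1) * harmonic(n)`, the product stated in the example.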
7 Concluding Remarks 

We have shown that the deterministic competitive ratio for $(n, k)$-companion caching is exactly 
$(n + 1)(k + 1) - 1$. We have also shown a lower bound of $\Omega(\log n + \log k)$ and an upper bound 
of $O(\log n \log k)$ on the randomized competitive ratio. We conjecture that the lower bound we 
have proven is tight. Specifically, we conjecture that the following algorithm is $O(\log n + \log k)$ 
competitive. 

Algorithm CW: On a fault on item $e$ of type $t$: let $i \ge 1$ be the current phase. If $t \notin T(P_{i-1})$, 
use type eviction if possible. Otherwise, use cache-wide eviction. 